Composing Monadic Queries in Trees
نویسندگان
چکیده
Node selection in trees is a fundamental operation to XML databases, programming languages, and information extraction. We propose a new class of querying languages to define n-ary node selection queries as compositions of monadic queries. The choice of the underlying monadic querying language is parametric. We show that compositions of monadic MSO-definable queries capture n-ary MSO-definable queries, and distinguish an MSO-complete n-ary query language that enjoys an efficient query answering algorithm.
منابع مشابه
A note on monadic datalog on unranked trees
In the article Recursive queries on trees and data trees (ICDT’13), Abiteboul et al. asked whether the containment problem for monadic datalog over unordered unranked labeled trees using the child relation and the descendant relation is decidable. This note gives a positive answer to this question, as well as an overview of the relative expressive power of monadic datalog on various representat...
متن کاملQuerying Unranked Trees with Stepwise Tree Automata
The problem of selecting nodes in unranked trees is the most basic querying problem for XML. We propose stepwise tree automata for querying unranked trees. Stepwise tree automata can express the same monadic queries as monadic Datalog and monadic second-order logic. We prove this result by reduction to the ranked case, via a new systematic correspondence that relates unranked and ranked queries.
متن کاملInteractive Learning of Node Selecting Tree Transducers⋆
We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply them to visually interactive Web information extraction. We propose to represent monadic queries by bottom-up deterministic Node Selecting Tree Transducers (Nstts), a particular class of tree au-tomata that we introduce. We prove that deterministic Nstts capture the class o...
متن کاملLearning Monadic Queries for Semi-Structured Documents from Positive Examples
Querying for nodes in trees is a core operation for information extraction from semi-structured documents in XML or HTML. We show that regular monadic queries for nodes in trees can be identified from positive examples, and this in polynomial time when represented by deterministic node selecting transducers that we introduce.
متن کاملSchema-Guided Induction of Monadic Queries
The induction of monadic node selecting queries from partially annotated XML-trees is a key task in Web information extraction. We show how to integrate schema guidance into an RPNI-based learning algorithm, in which monadic queries are represented by pruning node selecting tree transducers. We present experimental results on schema guidance by the DTD of HTML.
متن کامل